[AMDGPU] using loop to define data type convert patterns #132899

Open
Shoreshen wants to merge 23 commits into main

Conversation

Shoreshen (Contributor)

Use loops to define the data type conversion (BitConvert) patterns instead of listing each type pair individually.

@Shoreshen Shoreshen requested a review from arsenm March 25, 2025 09:06
@Shoreshen Shoreshen requested a review from krzysz00 March 25, 2025 09:06
llvmbot (Member) commented Mar 25, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: None (Shoreshen)

Changes

Use loops to define the data type conversion (BitConvert) patterns instead of listing each type pair individually.
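As a minimal sketch of the idiom (taken from the diff below, where the same shape is repeated for each register class and width): every register class carries a list of legal types, and nested `foreach` loops over that list emit one `BitConvert` pattern per ordered pair of distinct types.

```
// Emit a BitConvert pattern for every ordered pair of distinct types
// that are legal for the VReg_128 register class.
foreach vt = VReg_128.RegTypes in {
  foreach st = VReg_128.RegTypes in {
    // Skip the identity pair; only cross-type bitcasts need a pattern.
    if !not(!eq(vt, st)) then {
      def : BitConvert <vt, st, VReg_128>;
    }
  }
}
```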


Patch is 1.80 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/132899.diff

17 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/SIInstructions.td (+210-322)
  • (modified) llvm/lib/Target/AMDGPU/SIRegisterInfo.td (+7-4)
  • (added) llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll (+2394)
  • (added) llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.128bit.ll (+6084)
  • (added) llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.160bit.ll (+178)
  • (added) llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.16bit.ll (+556)
  • (added) llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.192bit.ll (+1062)
  • (added) llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.224bit.ll (+194)
  • (added) llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.256bit.ll (+9118)
  • (added) llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.288bit.ll (+209)
  • (added) llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.320bit.ll (+220)
  • (added) llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.32bit.ll (+1960)
  • (added) llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.352bit.ll (+228)
  • (added) llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.384bit.ll (+235)
  • (added) llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll (+15566)
  • (added) llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.64bit.ll (+4574)
  • (added) llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.96bit.ll (+163)
diff --git a/llvm/lib/Target/AMDGPU/SIInstructions.td b/llvm/lib/Target/AMDGPU/SIInstructions.td
index 900aed5b3f994..31b8b4254c4c4 100644
--- a/llvm/lib/Target/AMDGPU/SIInstructions.td
+++ b/llvm/lib/Target/AMDGPU/SIInstructions.td
@@ -1592,361 +1592,249 @@ foreach Index = 0-31 in {
 // FIXME: Why do only some of these type combinations for SReg and
 // VReg?
 // 16-bit bitcast
-def : BitConvert <i16, f16, VGPR_32>;
-def : BitConvert <f16, i16, VGPR_32>;
-def : BitConvert <f16, bf16, VGPR_32>;
-def : BitConvert <bf16, f16, VGPR_32>;
-
-def : BitConvert <i16, f16, SReg_32>;
-def : BitConvert <f16, i16, SReg_32>;
-def : BitConvert <f16, bf16, SReg_32>;
-def : BitConvert <bf16, f16, SReg_32>;
+foreach vt = Reg16Types.types in {
+  foreach st = Reg16Types.types in {
+    if !not(!eq (vt, st)) then {
+        def : BitConvert <vt, st, VGPR_32>;
+    }
+  }
+}
 
-def : BitConvert <i16, bf16, VGPR_32>;
-def : BitConvert <bf16, i16, VGPR_32>;
-def : BitConvert <i16, bf16, SReg_32>;
-def : BitConvert <bf16, i16, SReg_32>;
+foreach vt = Reg16Types.types in {
+  foreach st = Reg16Types.types in {
+    if !not(!eq (vt, st)) then {
+        def : BitConvert <vt, st, SReg_32>;
+    }
+  }
+}
 
 // 32-bit bitcast
-def : BitConvert <i32, f32, VGPR_32>;
-def : BitConvert <f32, i32, VGPR_32>;
-def : BitConvert <i32, f32, SReg_32>;
-def : BitConvert <f32, i32, SReg_32>;
-def : BitConvert <v2i16, i32, SReg_32>;
-def : BitConvert <i32, v2i16, SReg_32>;
-def : BitConvert <v2f16, i32, SReg_32>;
-def : BitConvert <i32, v2f16, SReg_32>;
-def : BitConvert <v2i16, v2f16, SReg_32>;
-def : BitConvert <v2f16, v2i16, SReg_32>;
-def : BitConvert <v2f16, f32, SReg_32>;
-def : BitConvert <f32, v2f16, SReg_32>;
-def : BitConvert <v2i16, f32, SReg_32>;
-def : BitConvert <f32, v2i16, SReg_32>;
-def : BitConvert <v2bf16, i32, SReg_32>;
-def : BitConvert <i32, v2bf16, SReg_32>;
-def : BitConvert <v2bf16, i32, VGPR_32>;
-def : BitConvert <i32, v2bf16, VGPR_32>;
-def : BitConvert <v2bf16, v2i16, SReg_32>;
-def : BitConvert <v2i16, v2bf16, SReg_32>;
-def : BitConvert <v2bf16, v2i16, VGPR_32>;
-def : BitConvert <v2i16, v2bf16, VGPR_32>;
-def : BitConvert <v2bf16, v2f16, SReg_32>;
-def : BitConvert <v2f16, v2bf16, SReg_32>;
-def : BitConvert <v2bf16, v2f16, VGPR_32>;
-def : BitConvert <v2f16, v2bf16, VGPR_32>;
-def : BitConvert <f32, v2bf16, VGPR_32>;
-def : BitConvert <v2bf16, f32, VGPR_32>;
-def : BitConvert <f32, v2bf16, SReg_32>;
-def : BitConvert <v2bf16, f32, SReg_32>;
+foreach vt = Reg32DataTypes.types in {
+  foreach st = Reg32DataTypes.types in {
+    if !not(!eq (vt, st)) then {
+        def : BitConvert <vt, st, VGPR_32>;
+    }
+  }
+}
+
+foreach vt = Reg32DataTypes.types in {
+  foreach st = Reg32DataTypes.types in {
+    if !not(!eq (vt, st)) then {
+        def : BitConvert <vt, st, SReg_32>;
+    }
+  }
+}
 
 
 // 64-bit bitcast
-def : BitConvert <i64, f64, VReg_64>;
-def : BitConvert <f64, i64, VReg_64>;
-def : BitConvert <v2i32, v2f32, VReg_64>;
-def : BitConvert <v2f32, v2i32, VReg_64>;
-def : BitConvert <i64, v2i32, VReg_64>;
-def : BitConvert <v2i32, i64, VReg_64>;
-def : BitConvert <i64, v2f32, VReg_64>;
-def : BitConvert <v2f32, i64, VReg_64>;
-def : BitConvert <f64, v2f32, VReg_64>;
-def : BitConvert <v2f32, f64, VReg_64>;
-def : BitConvert <f64, v2i32, VReg_64>;
-def : BitConvert <v2i32, f64, VReg_64>;
-def : BitConvert <v4i16, v4f16, VReg_64>;
-def : BitConvert <v4f16, v4i16, VReg_64>;
-def : BitConvert <v4bf16, v2i32, VReg_64>;
-def : BitConvert <v2i32, v4bf16, VReg_64>;
-def : BitConvert <v4bf16, i64, VReg_64>;
-def : BitConvert <i64, v4bf16, VReg_64>;
-def : BitConvert <v4bf16, v4i16, VReg_64>;
-def : BitConvert <v4i16, v4bf16, VReg_64>;
-def : BitConvert <v4bf16, v4f16, VReg_64>;
-def : BitConvert <v4f16, v4bf16, VReg_64>;
-def : BitConvert <v4bf16, v2f32, VReg_64>;
-def : BitConvert <v2f32, v4bf16, VReg_64>;
-def : BitConvert <v4bf16, f64, VReg_64>;
-def : BitConvert <f64, v4bf16, VReg_64>;
-
-
-// FIXME: Make SGPR
-def : BitConvert <v2i32, v4f16, VReg_64>;
-def : BitConvert <v4f16, v2i32, VReg_64>;
-def : BitConvert <v2i32, v4f16, VReg_64>;
-def : BitConvert <v2i32, v4i16, VReg_64>;
-def : BitConvert <v4i16, v2i32, VReg_64>;
-def : BitConvert <v2f32, v4f16, VReg_64>;
-def : BitConvert <v4f16, v2f32, VReg_64>;
-def : BitConvert <v2f32, v4i16, VReg_64>;
-def : BitConvert <v4i16, v2f32, VReg_64>;
-def : BitConvert <v4i16, f64, VReg_64>;
-def : BitConvert <v4f16, f64, VReg_64>;
-def : BitConvert <f64, v4i16, VReg_64>;
-def : BitConvert <f64, v4f16, VReg_64>;
-def : BitConvert <v4i16, i64, VReg_64>;
-def : BitConvert <v4f16, i64, VReg_64>;
-def : BitConvert <i64, v4i16, VReg_64>;
-def : BitConvert <i64, v4f16, VReg_64>;
-
-def : BitConvert <v4i32, v4f32, VReg_128>;
-def : BitConvert <v4f32, v4i32, VReg_128>;
+foreach vt = Reg64DataTypes.types in {
+  foreach st = Reg64DataTypes.types in {
+    if !not(!eq (vt, st)) then {
+        def : BitConvert <vt, st, VReg_64>;
+    }
+  }
+}
+
 
 // 96-bit bitcast
-def : BitConvert <v3i32, v3f32, SGPR_96>;
-def : BitConvert <v3f32, v3i32, SGPR_96>;
+foreach vt = SGPR_96.RegTypes in {
+  foreach st = SGPR_96.RegTypes in {
+    if !not(!eq (vt, st)) then {
+        def : BitConvert <vt, st, SGPR_96>;
+    }
+  }
+}
+
 
 // 128-bit bitcast
-def : BitConvert <v2i64, v4i32, SReg_128>;
-def : BitConvert <v4i32, v2i64, SReg_128>;
-def : BitConvert <v2f64, v4f32, VReg_128>;
-def : BitConvert <v2f64, v4i32, VReg_128>;
-def : BitConvert <v4f32, v2f64, VReg_128>;
-def : BitConvert <v4i32, v2f64, VReg_128>;
-def : BitConvert <v2i64, v2f64, VReg_128>;
-def : BitConvert <v2f64, v2i64, VReg_128>;
-def : BitConvert <v4f32, v2i64, VReg_128>;
-def : BitConvert <v2i64, v4f32, VReg_128>;
-def : BitConvert <v8i16, v4i32, SReg_128>;
-def : BitConvert <v4i32, v8i16, SReg_128>;
-def : BitConvert <v8f16, v4f32, VReg_128>;
-def : BitConvert <v8f16, v4i32, VReg_128>;
-def : BitConvert <v4f32, v8f16, VReg_128>;
-def : BitConvert <v4i32, v8f16, VReg_128>;
-def : BitConvert <v8i16, v8f16, VReg_128>;
-def : BitConvert <v8f16, v8i16, VReg_128>;
-def : BitConvert <v4f32, v8i16, VReg_128>;
-def : BitConvert <v8i16, v4f32, VReg_128>;
-def : BitConvert <v8i16, v8f16, SReg_128>;
-def : BitConvert <v8i16, v2i64, SReg_128>;
-def : BitConvert <v8i16, v2f64, SReg_128>;
-def : BitConvert <v8f16, v2i64, SReg_128>;
-def : BitConvert <v8f16, v2f64, SReg_128>;
-def : BitConvert <v8f16, v8i16, SReg_128>;
-def : BitConvert <v2i64, v8i16, SReg_128>;
-def : BitConvert <v2f64, v8i16, SReg_128>;
-def : BitConvert <v2i64, v8f16, SReg_128>;
-def : BitConvert <v2f64, v8f16, SReg_128>;
-
-def : BitConvert <v4i32, v8bf16, SReg_128>;
-def : BitConvert <v8bf16, v4i32, SReg_128>;
-def : BitConvert <v4i32, v8bf16, VReg_128>;
-def : BitConvert <v8bf16, v4i32, VReg_128>;
-
-def : BitConvert <v4f32, v8bf16, SReg_128>;
-def : BitConvert <v8bf16, v4f32, SReg_128>;
-def : BitConvert <v4f32, v8bf16, VReg_128>;
-def : BitConvert <v8bf16, v4f32, VReg_128>;
-
-def : BitConvert <v8i16, v8bf16, SReg_128>;
-def : BitConvert <v8bf16, v8i16, SReg_128>;
-def : BitConvert <v8i16, v8bf16, VReg_128>;
-def : BitConvert <v8bf16, v8i16, VReg_128>;
-
-def : BitConvert <v8f16, v8bf16, SReg_128>;
-def : BitConvert <v8bf16, v8f16, SReg_128>;
-def : BitConvert <v8f16, v8bf16, VReg_128>;
-def : BitConvert <v8bf16, v8f16, VReg_128>;
-
-def : BitConvert <v2f64, v8bf16, SReg_128>;
-def : BitConvert <v8bf16, v2f64, SReg_128>;
-def : BitConvert <v2f64, v8bf16, VReg_128>;
-def : BitConvert <v8bf16, v2f64, VReg_128>;
-
-def : BitConvert <v2i64, v8bf16, SReg_128>;
-def : BitConvert <v8bf16, v2i64, SReg_128>;
-def : BitConvert <v2i64, v8bf16, VReg_128>;
-def : BitConvert <v8bf16, v2i64, VReg_128>;
+foreach vt = VReg_128.RegTypes in {
+  foreach st = VReg_128.RegTypes in {
+    if !not(!eq (vt, st)) then {
+        def : BitConvert <vt, st, VReg_128>;
+    }
+  }
+}
+
+foreach vt = SReg_128.RegTypes in {
+  foreach st = SReg_128.RegTypes in {
+    if !not(!eq (vt, st)) then {
+        def : BitConvert <vt, st, SReg_128>;
+    }
+  }
+}
 
 
 // 160-bit bitcast
-def : BitConvert <v5i32, v5f32, SReg_160>;
-def : BitConvert <v5f32, v5i32, SReg_160>;
-def : BitConvert <v5i32, v5f32, VReg_160>;
-def : BitConvert <v5f32, v5i32, VReg_160>;
+foreach vt = VReg_160.RegTypes in {
+  foreach st = VReg_160.RegTypes in {
+    if !not(!eq (vt, st)) then {
+        def : BitConvert <vt, st, VReg_160>;
+    }
+  }
+}
+
+foreach vt = SReg_160.RegTypes in {
+  foreach st = SReg_160.RegTypes in {
+    if !not(!eq (vt, st)) then {
+        def : BitConvert <vt, st, SReg_160>;
+    }
+  }
+}
 
 // 192-bit bitcast
-def : BitConvert <v6i32, v6f32, SReg_192>;
-def : BitConvert <v6f32, v6i32, SReg_192>;
-def : BitConvert <v6i32, v6f32, VReg_192>;
-def : BitConvert <v6f32, v6i32, VReg_192>;
-def : BitConvert <v3i64, v3f64, VReg_192>;
-def : BitConvert <v3f64, v3i64, VReg_192>;
-def : BitConvert <v3i64, v6i32, VReg_192>;
-def : BitConvert <v3i64, v6f32, VReg_192>;
-def : BitConvert <v3f64, v6i32, VReg_192>;
-def : BitConvert <v3f64, v6f32, VReg_192>;
-def : BitConvert <v6i32, v3i64, VReg_192>;
-def : BitConvert <v6f32, v3i64, VReg_192>;
-def : BitConvert <v6i32, v3f64, VReg_192>;
-def : BitConvert <v6f32, v3f64, VReg_192>;
+foreach vt = VReg_192.RegTypes in {
+  foreach st = VReg_192.RegTypes in {
+    if !not(!eq (vt, st)) then {
+        def : BitConvert <vt, st, VReg_192>;
+    }
+  }
+}
+
+foreach vt = SReg_192.RegTypes in {
+  foreach st = SReg_192.RegTypes in {
+    if !not(!eq (vt, st)) then {
+        def : BitConvert <vt, st, SReg_192>;
+    }
+  }
+}
 
 // 224-bit bitcast
-def : BitConvert <v7i32, v7f32, SReg_224>;
-def : BitConvert <v7f32, v7i32, SReg_224>;
-def : BitConvert <v7i32, v7f32, VReg_224>;
-def : BitConvert <v7f32, v7i32, VReg_224>;
+foreach vt = VReg_224.RegTypes in {
+  foreach st = VReg_224.RegTypes in {
+    if !not(!eq (vt, st)) then {
+        def : BitConvert <vt, st, VReg_224>;
+    }
+  }
+}
 
-// 256-bit bitcast
-def : BitConvert <v8i32, v8f32, SReg_256>;
-def : BitConvert <v8f32, v8i32, SReg_256>;
-def : BitConvert <v8i32, v8f32, VReg_256>;
-def : BitConvert <v8f32, v8i32, VReg_256>;
-def : BitConvert <v4i64, v4f64, VReg_256>;
-def : BitConvert <v4f64, v4i64, VReg_256>;
-def : BitConvert <v4i64, v8i32, VReg_256>;
-def : BitConvert <v4i64, v8f32, VReg_256>;
-def : BitConvert <v4f64, v8i32, VReg_256>;
-def : BitConvert <v4f64, v8f32, VReg_256>;
-def : BitConvert <v8i32, v4i64, VReg_256>;
-def : BitConvert <v8f32, v4i64, VReg_256>;
-def : BitConvert <v8i32, v4f64, VReg_256>;
-def : BitConvert <v8f32, v4f64, VReg_256>;
-def : BitConvert <v16i16, v16f16, SReg_256>;
-def : BitConvert <v16f16, v16i16, SReg_256>;
-def : BitConvert <v16i16, v16f16, VReg_256>;
-def : BitConvert <v16f16, v16i16, VReg_256>;
-def : BitConvert <v16f16, v8i32, VReg_256>;
-def : BitConvert <v16i16, v8i32, VReg_256>;
-def : BitConvert <v16f16, v8f32, VReg_256>;
-def : BitConvert <v16i16, v8f32, VReg_256>;
-def : BitConvert <v8i32, v16f16, VReg_256>;
-def : BitConvert <v8i32, v16i16, VReg_256>;
-def : BitConvert <v8f32, v16f16, VReg_256>;
-def : BitConvert <v8f32, v16i16, VReg_256>;
-def : BitConvert <v16f16, v4i64, VReg_256>;
-def : BitConvert <v16i16, v4i64, VReg_256>;
-def : BitConvert <v16f16, v4f64, VReg_256>;
-def : BitConvert <v16i16, v4f64, VReg_256>;
-def : BitConvert <v4i64, v16f16, VReg_256>;
-def : BitConvert <v4i64, v16i16, VReg_256>;
-def : BitConvert <v4f64, v16f16, VReg_256>;
-def : BitConvert <v4f64, v16i16, VReg_256>;
-
-
-def : BitConvert <v8i32, v16bf16, VReg_256>;
-def : BitConvert <v16bf16, v8i32, VReg_256>;
-def : BitConvert <v8f32, v16bf16, VReg_256>;
-def : BitConvert <v16bf16, v8f32, VReg_256>;
-def : BitConvert <v4i64, v16bf16, VReg_256>;
-def : BitConvert <v16bf16, v4i64, VReg_256>;
-def : BitConvert <v4f64, v16bf16, VReg_256>;
-def : BitConvert <v16bf16, v4f64, VReg_256>;
-
-
-
-def : BitConvert <v16i16, v16bf16, SReg_256>;
-def : BitConvert <v16bf16, v16i16, SReg_256>;
-def : BitConvert <v16i16, v16bf16, VReg_256>;
-def : BitConvert <v16bf16, v16i16, VReg_256>;
-
-def : BitConvert <v16f16, v16bf16, SReg_256>;
-def : BitConvert <v16bf16, v16f16, SReg_256>;
-def : BitConvert <v16f16, v16bf16, VReg_256>;
-def : BitConvert <v16bf16, v16f16, VReg_256>;
+foreach vt = SReg_224.RegTypes in {
+  foreach st = SReg_224.RegTypes in {
+    if !not(!eq (vt, st)) then {
+        def : BitConvert <vt, st, SReg_224>;
+    }
+  }
+}
 
 
+// 256-bit bitcast
+foreach vt = VReg_256.RegTypes in {
+  foreach st = VReg_256.RegTypes in {
+    if !not(!eq (vt, st)) then {
+        def : BitConvert <vt, st, VReg_256>;
+    }
+  }
+}
+
+foreach vt = SReg_256.RegTypes in {
+  foreach st = SReg_256.RegTypes in {
+    if !not(!eq (vt, st)) then {
+        def : BitConvert <vt, st, SReg_256>;
+    }
+  }
+}
 
 
 // 288-bit bitcast
-def : BitConvert <v9i32, v9f32, SReg_288>;
-def : BitConvert <v9f32, v9i32, SReg_288>;
-def : BitConvert <v9i32, v9f32, VReg_288>;
-def : BitConvert <v9f32, v9i32, VReg_288>;
+foreach vt = VReg_288.RegTypes in {
+  foreach st = VReg_288.RegTypes in {
+    if !not(!eq (vt, st)) then {
+        def : BitConvert <vt, st, VReg_288>;
+    }
+  }
+}
+
+foreach vt = SReg_288.RegTypes in {
+  foreach st = SReg_288.RegTypes in {
+    if !not(!eq (vt, st)) then {
+        def : BitConvert <vt, st, SReg_288>;
+    }
+  }
+}
 
 // 320-bit bitcast
-def : BitConvert <v10i32, v10f32, SReg_320>;
-def : BitConvert <v10f32, v10i32, SReg_320>;
-def : BitConvert <v10i32, v10f32, VReg_320>;
-def : BitConvert <v10f32, v10i32, VReg_320>;
+foreach vt = VReg_320.RegTypes in {
+  foreach st = VReg_320.RegTypes in {
+    if !not(!eq (vt, st)) then {
+        def : BitConvert <vt, st, VReg_320>;
+    }
+  }
+}
+
+foreach vt = SReg_320.RegTypes in {
+  foreach st = SReg_320.RegTypes in {
+    if !not(!eq (vt, st)) then {
+        def : BitConvert <vt, st, SReg_320>;
+    }
+  }
+}
 
 // 320-bit bitcast
-def : BitConvert <v11i32, v11f32, SReg_352>;
-def : BitConvert <v11f32, v11i32, SReg_352>;
-def : BitConvert <v11i32, v11f32, VReg_352>;
-def : BitConvert <v11f32, v11i32, VReg_352>;
+foreach vt = VReg_352.RegTypes in {
+  foreach st = VReg_352.RegTypes in {
+    if !not(!eq (vt, st)) then {
+        def : BitConvert <vt, st, VReg_352>;
+    }
+  }
+}
+
+foreach vt = SReg_352.RegTypes in {
+  foreach st = SReg_352.RegTypes in {
+    if !not(!eq (vt, st)) then {
+        def : BitConvert <vt, st, SReg_352>;
+    }
+  }
+}
 
 // 384-bit bitcast
-def : BitConvert <v12i32, v12f32, SReg_384>;
-def : BitConvert <v12f32, v12i32, SReg_384>;
-def : BitConvert <v12i32, v12f32, VReg_384>;
-def : BitConvert <v12f32, v12i32, VReg_384>;
+foreach vt = VReg_384.RegTypes in {
+  foreach st = VReg_384.RegTypes in {
+    if !not(!eq (vt, st)) then {
+        def : BitConvert <vt, st, VReg_384>;
+    }
+  }
+}
+
+foreach vt = SReg_384.RegTypes in {
+  foreach st = SReg_384.RegTypes in {
+    if !not(!eq (vt, st)) then {
+        def : BitConvert <vt, st, SReg_384>;
+    }
+  }
+}
 
 // 512-bit bitcast
-def : BitConvert <v32f16, v32i16, VReg_512>;
-def : BitConvert <v32i16, v32f16, VReg_512>;
-def : BitConvert <v32f16, v16i32, VReg_512>;
-def : BitConvert <v32f16, v16f32, VReg_512>;
-def : BitConvert <v16f32, v32f16, VReg_512>;
-def : BitConvert <v16i32, v32f16, VReg_512>;
-def : BitConvert <v32i16, v16i32, VReg_512>;
-def : BitConvert <v32i16, v16f32, VReg_512>;
-def : BitConvert <v16f32, v32i16, VReg_512>;
-def : BitConvert <v16i32, v32i16, VReg_512>;
-def : BitConvert <v16i32, v16f32, VReg_512>;
-def : BitConvert <v16f32, v16i32, VReg_512>;
-def : BitConvert <v8i64,  v8f64,  VReg_512>;
-def : BitConvert <v8f64,  v8i64,  VReg_512>;
-def : BitConvert <v8i64,  v16i32, VReg_512>;
-def : BitConvert <v8f64,  v16i32, VReg_512>;
-def : BitConvert <v16i32, v8i64,  VReg_512>;
-def : BitConvert <v16i32, v8f64,  VReg_512>;
-def : BitConvert <v8i64,  v16f32, VReg_512>;
-def : BitConvert <v8f64,  v16f32, VReg_512>;
-def : BitConvert <v16f32, v8i64,  VReg_512>;
-def : BitConvert <v16f32, v8f64,  VReg_512>;
-def : BitConvert <v8i64,  v32f16, VReg_512>;
-def : BitConvert <v8i64,  v32i16, VReg_512>;
-def : BitConvert <v8f64,  v32f16, VReg_512>;
-def : BitConvert <v8f64,  v32i16, VReg_512>;
-def : BitConvert <v32f16, v8i64,  VReg_512>;
-def : BitConvert <v32f16, v8f64,  VReg_512>;
-def : BitConvert <v32i16, v8i64,  VReg_512>;
-def : BitConvert <v32i16, v8f64,  VReg_512>;
-
-
-def : BitConvert <v32bf16, v32i16, VReg_512>;
-def : BitConvert <v32i16, v32bf16, VReg_512>;
-def : BitConvert <v32bf16, v32i16, SReg_512>;
-def : BitConvert <v32i16, v32bf16, SReg_512>;
-
-def : BitConvert <v32bf16, v32f16, VReg_512>;
-def : BitConvert <v32f16, v32bf16, VReg_512>;
-def : BitConvert <v32bf16, v32f16, SReg_512>;
-def : BitConvert <v32f16, v32bf16, SReg_512>;
-
-def : BitConvert <v32bf16, v16i32, VReg_512>;
-def : BitConvert <v16i32, v32bf16, VReg_512>;
-def : BitConvert <v32bf16, v16i32, SReg_512>;
-def : BitConvert <v16i32, v32bf16, SReg_512>;
-
-def : BitConvert <v32bf16, v16f32, VReg_512>;
-def : BitConvert <v16f32, v32bf16, VReg_512>;
-def : BitConvert <v32bf16, v16f32, SReg_512>;
-def : BitConvert <v16f32, v32bf16, SReg_512>;
-
-def : BitConvert <v32bf16, v8f64, VReg_512>;
-def : BitConvert <v8f64, v32bf16, VReg_512>;
-def : BitConvert <v32bf16, v8f64, SReg_512>;
-def : BitConvert <v8f64, v32bf16, SReg_512>;
-
-def : BitConvert <v32bf16, v8i64, VReg_512>;
-def : BitConvert <v8i64, v32bf16, VReg_512>;
-def : BitConvert <v32bf16, v8i64, SReg_512>;
-def : BitConvert <v8i64, v32bf16, SReg_512>;
+foreach vt = VReg_512.RegTypes in {
+  foreach st = VReg_512.RegTypes in {
+    if !not(!eq (vt, st)) then {
+        def : BitConvert <vt, st, VReg_512>;
+    }
+  }
+}
+
+foreach vt = SReg_512.RegTypes in {
+  foreach st = SReg_512.RegTypes in {
+    if !not(!eq (vt, st)) then {
+        def : BitConvert <vt, st, SReg_512>;
+    }
+  }
+}
 
 // 1024-bit bitcast
-def : BitConvert <v32i32, v32f32, VReg_1024>;
-def : BitConvert <v32f32, v32i32, VReg_1024>;
-def : BitConvert <v16i64, v16f64, VReg_1024>;
-def : BitConvert <v16f64, v16i64, VReg_1024>;
-def : BitConvert <v16i64, v32i32, VReg_1024>;
-def : BitConvert <v32i32, v16i64, VReg_1024>;
-def : BitConvert <v16f64, v32f32, VReg_1024>;
-def : BitConvert <v32f32, v16f64, VReg_1024>;
-def : BitConvert <v16i64, v32f32, VReg_1024>;
-def : BitConvert <v32i32, v16f64, VReg_1024>;
-def : BitConvert <v16f64, v32i32, VReg_1024>;
-def : BitConvert <v32f32, v16i64, VReg_1024>;
+foreach vt = VReg_1024.RegTypes in {
+  foreach st = VReg_1024.RegTypes in {
+    if !not(!eq (vt, st)) then {
+        def : BitConvert <vt, st, VReg_1024>;
+    }
+  }
+}
+
+foreach vt = SReg_1024.RegTypes in {
+  foreach st = SReg_1024.RegTypes in {
+    if !not(!eq (vt, st)) then {
+        def : BitConvert <vt, st, SReg_1024>;
+    }
+  }
+}
 
 
 /********** =================== **********/
diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.td b/llvm/lib/Target/AMDGPU/SIRegisterInfo.td
index 35c7b393a8ca4..6b54764a876ca 100644
--- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.td
+++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.td
@@ -547,8 +547,12 @@ class RegisterTypes<list<ValueType> reg_types> {
 }
 
 def Reg16Types : RegisterTypes<[i16, f16, bf16]>;
-def Reg32Types : RegisterTypes<[i32, f32, v2i16, v2f16, v2bf16, p2, p3, p5, p6]>;
-def Reg64Types : RegisterTypes<[i64, f64, v2i32, v2f32, p0, p1, p4, v4i16, v4f16, v4bf16]>;
+def Reg32DataTypes: RegisterTypes<[i32, f32, v2i16, v2f16, v2bf16]>;
+def Reg32PtrTypes: RegisterTypes<[p2, p3, p5, p6]>;
+def Reg32Types : RegisterTypes<!listconcat(Reg32DataTypes.types, Reg32PtrTypes.types)>;
+def Reg64DataTypes: RegisterTypes<[i64, f64, v2i32, v2f32, v4i16, v4f16, v4bf16]>;
+def Reg64PtrTypes: RegisterTypes<[p0, p1, p4]>;
+def Reg64Types : RegisterTypes<!listconcat(Reg64DataTypes.types, Reg64PtrTypes.types)>;
 def Reg96Types : RegisterTypes<[v3i32, v3f32]>;
 def Reg128Types : RegisterTypes<[v4i32, v4f32, v2i64, v2f64, v8i16, v8f16, v8bf16]>;
 
@@ -940,8 +944,7 @@ multiclass VRegClass<int numRegs, list<ValueType> regTypes, dag regList> {
   }
 }
 
-defm VReg_64 : VRegClass<2, [i64, f64, v2i32, v2f32, v4f16, v4bf16, v4i16, p0, p1, p4],
-                                (add VGPR_64)>;
+defm VReg_64 : VRegClass<2, Reg64Types.types, (add VGPR_64)>;
 defm VReg_96 : VRegClass<3, Reg96Types.types, (add VGPR_96)>;
 defm VReg_128 : VRegClass<4, Reg128Types.types, (add VGPR_128)>;
 defm VReg_160 : VRegClass<5, [v5i32, v5f32], (add VGPR_160)>;
diff --git a/llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll b/llv...
[truncated]

arsenm (Contributor) left a comment:
There should be no test changes. Splitting the tests by type should be done separately

arsenm (Contributor) left a comment:
And should still drop the test changes, and leave this as an NFC refactoring only

Shoreshen (Contributor, Author)

And should still drop the test changes, and leave this as an NFC refactoring only

Hi @arsenm, please see this PR: #133052

All tests passed and the reasons are listed. Thanks.

Shoreshen added a commit that referenced this pull request Apr 8, 2025
This PR adds test cases for all types of bit conversion; it prepares
for PR #132899.



All tests passed because:
1. For the DAG selector, the patterns do not distinguish SReg from VReg. One sample is:
    ```
   define <2 x double> @v_bitcast_v4f32_to_v2f64(<4 x float> inreg %a, i32 %b) {
     %cmp = icmp eq i32 %b, 0
     br i1 %cmp, label %cmp.true, label %cmp.false
   
   cmp.true:
     %a1 = fadd <4 x float> %a, splat (float 1.000000e+00)
     %a2 = bitcast <4 x float> %a1 to <2 x double>
     br label %end
   
   cmp.false:
     %a3 = bitcast <4 x float> %a to <2 x double>
     br label %end
   
   end:
     %phi = phi <2 x double> [ %a2, %cmp.true ], [ %a3, %cmp.false ]
     ret <2 x double> %phi
   }
   ```
It is supposed to select the scalar-register patterns, but the VReg pattern
is matched instead, as shown below:
    ```
   Debug log:
   ISEL: Starting selection on root node: t3: v2f64 = bitcast t2
   ISEL: Starting pattern match
     Initial Opcode index to 440336
     Skipped scope entry (due to false predicate) at index 440339, continuing at 440367
     Skipped scope entry (due to false predicate) at index 440368, continuing at 440396
     Skipped scope entry (due to false predicate) at index 440397, continuing at 440435
     Skipped scope entry (due to false predicate) at index 440436, continuing at 440467
     Skipped scope entry (due to false predicate) at index 440468, continuing at 440499
     Skipped scope entry (due to false predicate) at index 440500, continuing at 440552
     Skipped scope entry (due to false predicate) at index 440553, continuing at 440587
     Skipped scope entry (due to false predicate) at index 440588, continuing at 440622
     Skipped scope entry (due to false predicate) at index 440623, continuing at 440657
     Skipped scope entry (due to false predicate) at index 440658, continuing at 440692
     Skipped scope entry (due to false predicate) at index 440693, continuing at 440727
     Skipped scope entry (due to false predicate) at index 440728, continuing at 440769
     Skipped scope entry (due to false predicate) at index 440770, continuing at 440798
     Skipped scope entry (due to false predicate) at index 440799, continuing at 440836
     Skipped scope entry (due to false predicate) at index 440837, continuing at 440870
     TypeSwitch[v2f64] from 440873 to 440892
   
   Patterns:
   /*440892*/    OPC_CompleteMatch, 1, 0, 
                  // Src: (bitconvert:{ *:[v2f64] } VReg_128:{ *:[v4f32] }:$src0) - Complexity = 3
                  // Dst: VReg_128:{ *:[v2f64] }:$src0
    ```
2. GlobalISel will use `Select_COPY` to select the bitcast.
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Apr 8, 2025
arsenm (Contributor) left a comment:
This patch should still have no test changes in it

Shoreshen (Contributor, Author) commented Apr 15, 2025

This patch should still have no test changes in it

Hi @arsenm, please see PR #135729 for the test changes only. Thanks.
